Graphics engine command FIFO for programming multiple registers using a mapping index with register offsets

ABSTRACT

A host writes graphics commands and data to programmable registers through a command FIFO that is read by a graphics controller or BitBlt engine. Rather than write an address and a data value for each register programmed, the host writes one address, one index, and several data values. The address points to an index register. The index is a mapping index word with several multi-bit mapping fields. Each multi-bit mapping field in the index identifies a register to be programmed with one of the data values. Since N bits are used for each mapping field, the mapping field can select one register in a bank of 2^N−1 registers. The registers in the bank can be programmed in any order, and registers can be skipped. Since only one index is stored in the command FIFO for programming several registers, less memory space and fewer bus cycles are required.

BACKGROUND OF INVENTION

This invention relates to graphics systems, and more particularly to addressing of programmable registers.

Personal computers (PCs) and other computer systems have a variety of controller integrated circuits (ICs) or chips that control subsystems such as graphics, disks, and general system logic. Such controller chips are usually programmable. For example, the graphics controller can be programmed with the display resolution, such as the number of pixels in a horizontal line, or the number of lines on a screen. Memory-controller chips can be programmed with numbers of clock cycles for memory accesses, so that the timing signals generated by the controller chip can be adjusted for faster memory chips or faster bus clocks.

Advanced graphics systems often employ specialized engines, such as a bit-block-transfer (BitBlt) engine. Graphics data and commands can be written to a command first-in-first-out (FIFO) by a host processor, allowing the BitBlt engine to read and process graphics data and commands at its own pace.

The host microprocessor's address space is typically partitioned into memory and input/output (I/O) address spaces. While a large memory address space such as 4 GigaBytes (32 address bits) is provided, the I/O address space is typically much smaller, perhaps only 64 Kbytes (16 address bits). I/O addresses are used for accessing peripheral devices such as I/O ports, disk drives, modems, mouse and keyboard, and the controller chips. Often certain ranges of I/O addresses are reserved for certain types of peripherals, such as graphics, disks, and parallel ports. Thus the number of I/O addresses available to a peripheral controller chip is often limited.

Some of the programmable registers may be assigned addresses in the memory space rather than the I/O space. Since memory accesses are often faster than I/O accesses, memory-mapped registers can be accessed more quickly, improving performance. Frequently-accessed registers are often memory-mapped rather than I/O.

Programmable Registers FIGS. 1, 2

FIG. 1 shows a computer system with a controller chip with programmable registers. A central processing unit (CPU) 12 is a microprocessor that executes instructions in a program stored in memory 14 or in a BIOS ROM (not shown). Display 16 is controlled by graphics controller 10. Programs executing on CPU 12 can update the information shown on display 16 by writing to a frame buffer inside or controlled by graphics controller 10. Graphics controller 10 reads lines of pixels from the frame buffer and transfers them to display 16, which can be a cathode-ray tube (CRT) monitor or a flat-panel display.

Bus 11 connects CPU 12 and graphics controller 10, and includes an address bus and a data bus. Bus 11 may be divided into separate sections by buffer chips. Often a high-speed bus such as a PCI (Peripheral Component Interconnect) or AGP (Accelerated Graphics Port) bus is used to connect to graphics controller 10.

Graphics controller 10 includes programmable registers 20 that control various features. For example, power-saving modes, display characteristics, timing, and shading can be controlled by CPU 12 writing to programmable registers 20. Registers are frequently written during 3D rendering or BitBlt operations.

FIG. 2 highlights an address decoder that selects a data register for access. A shared address/data bus is used where the address is output during a first bus cycle while the data is output during a second bus cycle. During the first bus cycle, the CPU outputs an address on the bus to decoder 31. This address is decoded by decoder 31, causing selector 34 to select one of the registers in programmable registers 20 for access. The other programmable registers are deselected and cannot be accessed until a new address is written to decoder 31.

In the second bus cycle, the CPU writes a data value to the bus. The data written by the CPU is written through selector 34 to the register in programmable registers 20 that was selected by the address in decoder 31. The CPU may also read the selected register rather than write the selected register since selector 34 provides a bi-directional data path, depending on the read/write control signal from the CPU. For the PCI bus, address decoding takes 1, 2, or 3 clock cycles and data is written on the fourth clock cycle. A two-cycle idle time is necessary. Thus each PCI bus transaction requires 6 clock cycles.

The values written to programmable registers 20 are used to control features of the controller chip. For example, programmable registers 20 can output a number of pixels per horizontal line, and a number of lines in a screen, to counters 38 in a graphics controller. When the number of pixels written to the display matches the value of pixels/line from programmable registers 20, then a horizontal sync HSYNC pulse is generated. When the number of lines counted matches the total number of lines from programmable registers 20, then the vertical sync VSYNC is generated. Controls for windows within a screen can likewise come from programmable registers 20, such as for a movie window as described in Transparent Blocking of CRT Refresh Fetches During Video Overlay Using Dummy Fetches, U.S. Pat. No. 5,754,170 by Ranganathan et al., and assigned to NeoMagic Corp.

FIG. 3 shows standard bus cycles to program registers. During the first bus cycle, a first address A1 is output on the bus from the CPU to the controller chip. Address A1 is the address of a first programmable register. In the second bus cycle, data D1 is output on the bus from the CPU to the controller chip. The controller chip stores data D1 from the bus into the programmable register for address A1.

A second data value is written to a second programmable register during the third and fourth bus cycles. Address A2 is output during the third bus cycle while data D2 is output during the fourth bus cycle. The controller chip writes data D2 to the register identified by address A2. A third data value is written to another programmable register in the fifth and sixth bus cycles. Data D5 is written to the controller chip's register for address A5.

Each programmable register written requires a 2-bus-cycle access where the address is followed by the data. The programmable registers can be written in any order, but the correct address must precede the data value in each pair of bus cycles. Data may be read rather than written to the programmable registers by not asserting a write signal from the CPU.

Burst Access FIGS. 4, 5

High-speed busses often support higher data bandwidth using a burst access. During a burst-access cycle, the address input in the first bus cycle is followed by several data values input over several bus cycles. A predefined burst order is used to determine the addresses of the data values in the burst sequence.

FIG. 4 is a diagram of data being bursted into programmable registers. Burst decoder 33 receives a starting address A1 during a first bus cycle. Selector 34 routes the data to the A1 data register in programmable registers 20 having the starting address (A1) in the second bus cycle.

During the next 3 bus cycles, data values are received without addresses. The addresses of these three data values are implied by the burst rules. The burst rules define the address order during burst cycles. For purely sequential burst rules, the implied addresses of the next 3 data values are A1+1, A1+2, and A1+3. Often the burst addresses are interleaved so the addresses are somewhat mixed in order: A1+2, A1+1, then A1+3. The burst order is usually a fixed order defined by the architecture. Although a purely sequential burst is used as the example, other semi-sequential or interleaved burst orders may be substituted. The burst sequence is usually for sequential addresses (1,2,3,4), or semi-sequential addresses (1,3,2,4, or 1,4,2,3, or others) in some predefined sequence.
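The implied addressing can be illustrated with a minimal Python sketch. The function name `burst_addresses` and its `order` parameter are illustrative assumptions that do not appear in the figures; the offsets follow the sequential and interleaved examples given above.

```python
def burst_addresses(start, order=(0, 1, 2, 3)):
    """Return the implied register addresses for one burst.

    `order` lists the offsets from the starting address in the sequence
    the data values arrive. It is fixed by the burst rules, not by the host.
    """
    return [start + offset for offset in order]


# Purely sequential burst: A1, A1+1, A1+2, A1+3
sequential = burst_addresses(0x10)

# Interleaved burst: A1, A1+2, A1+1, A1+3
interleaved = burst_addresses(0x10, order=(0, 2, 1, 3))
```

In either case the host supplies only the starting address; every other address is derived from the fixed order, so a register in the middle of the sequence cannot be skipped.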

During the third bus cycle, burst decoder 33 causes selector 34 to route the second data value D2 to the next data register (A2) in programmable registers 20. Then in the fourth bus cycle, burst decoder 33 causes selector 34 to route the third data value D3 to the third data register (A3) in programmable registers 20. Finally, in the fifth bus cycle, burst decoder 33 causes selector 34 to route the fourth data value D4 to the fourth data register (A4) in programmable registers 20.

FIG. 5 is a timing diagram of a burst access of programmable registers. In the first bus cycle, address A1 is sent from the CPU to the controller chip. This is the starting address of the burst access, identifying the first data register to be written. In the second bus cycle, data value D1 is sent to the controller chip and written into the A1 programmable register. Then in the third bus cycle, data value D2 is written to the A2 register. In the fourth bus cycle, data value D3 is written to the A3 register, while in the fifth bus cycle, data value D4 is written to the A4 register. The burst can stop after four data values are written, or continue with data value D5 being written to the A5 register.

Only the starting address A1 was written to the controller chip. The other addresses A2, A3, A4, A5 were not sent across the bus from the CPU to the controller chip. These addresses are implied by the burst rules.

Since only one address is sent for four or more data values, more of the bus bandwidth is used for data transfers than for address transfers. This improves the efficiency of the bus, allowing data to be written to the controller chip more quickly. Higher performance results.

The data values burst in must exactly follow the burst sequence defined by the burst rules. Data cannot be written out of order without stopping the burst and inputting a new address.

Non-Sequential Register Access Using Command FIFO FIGS. 6, 7

FIG. 6 shows that non-sequential programmable registers are sometimes accessed. Often programs or software drivers only need to update some of the programmable registers while other programmable registers are not updated. Host 26 can write graphics commands and data to command FIFO 21. For each register in programmable registers 20 that is to be written, two entries are written to command FIFO 21. The first entry is an address of the programmable register, while the second entry is the data or command to be written to the programmable register.

For example, the first pair in command FIFO 21 is the pair of entries A1, D1. Data D1 is to be written to the register at address A1. In the example of FIG. 6, only registers A1, A2, A4, and A6 in programmable registers 20 need to be updated. Registers A3 and A5 do not need to be written. Host 26 can use burst cycles to fill command FIFO 21, but the graphics controller or BitBlt engine does not use burst cycles to write to programmable registers 20 with the entries read from command FIFO 21, since the registers written are out-of-sequence. Using a burst access to write programmable registers 20 would require that the intervening registers A3, A5 also be written.
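The address/data pairing of FIG. 6 can be sketched in Python as follows. This is only an illustration: the helper name `address_data_entries`, the tuple tags, and the string placeholders for data values are assumptions made for the example.

```python
def address_data_entries(writes):
    """Build conventional command-FIFO contents: one address entry
    followed by one data entry for every register to be written."""
    entries = []
    for address, data in writes:
        entries.append(("addr", address))
        entries.append(("data", data))
    return entries


# Registers A1, A2, A4 and A6 are updated; A3 and A5 are skipped.
writes = [(1, "D1"), (2, "D2"), (4, "D4"), (6, "D6")]
entries = address_data_entries(writes)
entry_count = len(entries)  # two FIFO entries per register written
```

Four register writes therefore occupy eight entries, half of which carry only addresses.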

FIG. 7 is a timing diagram of writing to non-sequential programmable registers from the command FIFO. Since registers A3, A5 are not being written, a burst access to write the registers is not possible. Standard address-data cycles are used, and the data registers are programmed one at a time.

In the first and second bus cycles address A1 and data D1 are sent to the controller chip to program register A1 with data D1. A bus-idle period may follow as shown in this example.

Register A2 is programmed with data D2 in the next bus cycles, while register A4 is programmed with data D4 in other bus cycles. Finally register A6 is programmed with data D6 in the last bus cycles.

While command FIFO 21 improves efficiency of host-to-register transfers, a large FIFO may be required. Since a register address is stored with each data entry, two entries in command FIFO 21 are needed for each register programmed. One address could be shared over many register accesses using a burst access if all registers in a sequence were accessed, but often registers are not programmed in the sequential burst order. Sometimes only a relatively few registers are written. When even one register in the burst sequence is not written, then burst access may not be possible.

What is desired is more efficient use of a command FIFO to access programmable registers. It is desired to access programmable registers through a command FIFO without storing separate addresses for each register. It is desired to access registers that are not in a sequential burst-sequence order. It is desired to program only a subset of the registers in a sequence while still sharing register address entries in the command FIFO. A more efficient method to access non-sequential programmable registers is desired.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 shows a computer system with a controller chip with programmable registers.

FIG. 2 highlights an address decoder that selects a data register for access.

FIG. 3 shows standard bus cycles to program registers.

FIG. 4 is a diagram of data being bursted into programmable registers.

FIG. 5 is a timing diagram of a burst access of programmable registers.

FIG. 6 shows that non-sequential programmable registers are sometimes accessed.

FIG. 7 is a timing diagram of writing to non-sequential programmable registers from the command FIFO.

FIG. 8 is a timing diagram of an indexed write to non-sequential programmable registers.

FIG. 9 is a diagram of an index for mapping which of the programmable registers are to be written during a burst.

FIG. 10 illustrates register programming using a mapping index rather than separate addresses.

FIG. 11 is a diagram of a mapping-index decoder that determines a number of registers to be programmed.

FIG. 12 is a diagram of a register programmer that uses a mapping index to select registers for programming.

DETAILED DESCRIPTION

The present invention relates to an improvement in using a command FIFO to program registers. The following description is presented to enable one of ordinary skill in the art to make and use the invention as provided in the context of a particular application and its requirements. Various modifications to the preferred embodiment will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed.

The inventor has realized that the memory efficiency of a command FIFO can be increased even when non-sequential programmable registers are programmed. When the first entry is for a special index register, the following data entries do not need their own address entries. The first data value sent in the sequence is an index that the controller chip or BitBlt engine decodes to determine which registers to write. The controller chip then writes these registers with the remaining data values in the sequence stored in the command FIFO.

The software driver usually knows in advance which registers are going to be programmed, so a list of these registers is generated by the software driver. This list is translated to register fields in the index. Then the software driver writes to the command FIFO at the first address with the index, followed by the data values to be written to the programmable registers.
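The translation from a register list to an index can be sketched in Python. The sketch assumes the layout described for FIG. 9: a 32-bit index holding eight 4-bit mapping fields, filled from the least-significant field upward, with all ones marking an unused field. The names `pack_index`, `FIELD_BITS`, `FIELDS_PER_INDEX`, and `DISABLED` are hypothetical.

```python
FIELD_BITS = 4        # width of one mapping field
FIELDS_PER_INDEX = 8  # eight fields fit in a 32-bit index
DISABLED = 0xF        # all ones: no register selected


def pack_index(registers):
    """Translate a list of register numbers (0x0 to 0xE) into a mapping
    index. Unused fields are filled with the disabling value."""
    if len(registers) > FIELDS_PER_INDEX:
        raise ValueError("too many registers for one index")
    index = 0
    for position in range(FIELDS_PER_INDEX):
        if position < len(registers):
            field = registers[position]
            if not 0 <= field < DISABLED:
                raise ValueError("register number out of range")
        else:
            field = DISABLED
        index |= field << (position * FIELD_BITS)
    return index


index = pack_index([0x0, 0x1, 0x4, 0x5, 0xC])
```

A driver would then write `index` to the command FIFO, followed by one data value per listed register in the same order.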

FIG. 8 is a timing diagram of an indexed write to non-sequential programmable registers. During a first bus cycle, the CPU sends the current-entry address of the command FIFO. This is the address of the next entry to be written in the command FIFO. In the second bus cycle, the CPU writes the index. The index appears to the computer system to be the first data value of the burst. However, the controller chip decodes this index to generate a map of which registers to program during the burst.

In the third bus cycle, the CPU writes the data value D1. The data value D2 is written to the command FIFO in the fourth bus cycle, and in the fifth bus cycle, data value D4 is written to the FIFO. During the sixth bus cycle, data value D6 is written.

When the controller chip reads the command FIFO, it assumes that the first data value is the index for the special index register. The controller chip then reads the first data value and decodes it as the index.

The controller chip then routes the next data value, D1, to the A1 data register, since the index indicated that registers A1, A2, A4, and A6 will be programmed by the burst. The controller chip writes the D2 data to the A2 register. Data value D4 is written to the A4 register, while data value D6 is written to the A6 register in the controller chip.

A total of 6 bus cycles and 6 entries in the command FIFO are required for the non-sequential burst. In comparison with FIG. 7, which required 8 bus cycles and 8 entries, the bus and command FIFO usage is reduced by 25%. When eight registers are programmed for each index, memory usage is improved by 77%. Host CPU efficiency increases since more registers can be written before a FIFO-full interrupt occurs.

After the last data value is read, the next entry in the command FIFO is another index. The process of decoding the mapping fields in this next index can be repeated to program additional registers. Read and write pointers can be kept. Once the read pointer reaches the write pointer, the controller chip has read all values from the command FIFO and can stop reading.
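Reading consecutive index/data groups from the command FIFO can be sketched as below. The sketch is illustrative only: it models the FIFO as a Python list, uses the list length as the write pointer, and assumes 4-bit mapping fields with all ones as the disabling value. The name `drain_fifo` is hypothetical.

```python
def drain_fifo(fifo, registers):
    """Consume index/data groups until the read pointer reaches the
    write pointer (modelled here as the end of the list)."""
    read = 0
    while read < len(fifo):
        index = fifo[read]      # first entry of each group is the index
        read += 1
        for position in range(8):
            field = (index >> (4 * position)) & 0xF
            if field == 0xF:    # disabled field: no data entry consumed
                continue
            registers[field] = fifo[read]
            read += 1
    return registers


fifo = [
    0xFFFC5410, 10, 11, 14, 15, 0x1C,  # group 1: registers 0, 1, 4, 5, C
    0xFFFFFF32, 22, 23,                # group 2: registers 2, 3
]
registers = drain_fifo(fifo, {})
```

No address entries appear between the groups; the count of enabled fields in each index tells the reader where the next index begins.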

Mapping Index FIG. 9

FIG. 9 is a diagram of an index for mapping which of the programmable registers are to be written during a burst. A 32-bit index is shown, but other sizes may be used.

A total of eight mapping fields are contained in the index. Each mapping field can enable writing of one programmable register. Thus up to eight programmable registers can be written using one mapping index.

Each mapping field in the index is four bits wide. The four bits are decoded to indicate which register to access. For example, the four least-significant-bits (LSBs) are 0000 and are decoded to indicate access of register 0 (address A1 is 0). The next four LSBs are 0001 and are decoded to indicate access of register 1 (address A2 is 1).

Decoding the other mapping fields for A3, A4, and A5 reveals that registers 4, 5, and C (hex) are to be accessed. Thus registers 0, 1, 4, 5, C are to be accessed.

When the four bits in a mapping field are all ones, the mapping field does not indicate access of any register. Instead, the register access is disabled. In this example, the last 3 mapping fields (the MSBs) are each 1111, indicating that the last 3 register accesses are disabled. Only five registers are accessed in this example.

In the example of FIG. 9, registers 0, 1, 4, 5, C are programmed. The other registers 2, 3, 6-9, A, B, D and E are skipped and not accessed by the burst. Thus only 5 of the 15 possible registers are accessed when this example's index is sent to the controller chip.

The registers are accessed in the order determined by the mapping fields. After the index is read, data values D0, D1, D4, D5, DC are sent to the programmable registers in the next 5 bus cycles. These data values are written to programmable registers 0, 1, 4, 5, C by the controller chip.

Once the first address is sent during the first bus cycle in a burst, the addresses of the other data values in the burst are irrelevant. The registers accessed by data written into the command FIFO are determined by the mapping fields in the index.

FIG. 10 illustrates register programming using a mapping index rather than separate addresses. Entries 28 in the command FIFO include a mapping index and data to be written to the registers indicated by the mapping index.

The mapping index is copied to index register 32. Selector 34 selects one of the mapping fields in index register 32 for decoding by decoder 36. For example, the LSB mapping field can be decoded first, then the next LSB field, etc.

Decoder 36 sends control signals to bus/switch 38, which selects the first data entry after the mapping index from entries 28. This first data entry is routed over internal buses to the selected register in programmable registers 30. The data entry can be routed to several or all registers and the individual register enabled by a control, select, write, or enable signal from decoder 36. One or more buses may be used.

As successive mapping fields from index register 32 are selected by selector 34, successive data entries in entries 28 are written to selected registers in programmable registers 30. For example, data D0 is written to register 0, data D1 is written to register 1, data D5 is written to register 5, and data DC is written to register C. Not all registers have to be written, and the order in which registers are written may be non-sequential. For example, register 5 could be written before register 1.

FIG. 11 is a diagram of a mapping-index decoder that determines a number of registers to be programmed. Other decoders can be substituted. In this example, the mapping index must always program at least one register. The upper 28 bits of the mapping index, INDEX[31:4], are compared to ones by AND gate 70. When all 28 MSBs are one, AND gate 70 outputs a one to priority encoder 80, which causes mux 82 to output its highest input, 0x0. This sets the data count to 0, indicating that only one register is to be programmed.

When the upper 28 bits of the index are not all ones, AND gate 71 examines the upper 24 bits. When these upper 24 bits are all ones, AND gate 71 outputs a one to priority encoder 80, which causes mux 82 to output 0x1, indicating that 2 registers are to be programmed.

AND gates 72-76 similarly examine smaller numbers of 4-bit mapping fields of the index. The first (most-significant) zero is thus encoded by priority encoder 80 and mux 82 to indicate the number of registers to be programmed. When all mapping fields contain a zero, then all 8 registers are to be programmed. Priority encoder 80 causes mux 82 to output 0x7 to indicate a data count of 8.
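The behavior of the FIG. 11 decoder can be modelled in Python as a rough sketch. It is a behavioral illustration rather than a gate-level description: each loop pass stands in for one of AND gates 70-76, and the loop order stands in for priority encoder 80. The function name `data_count` is an assumption, and the returned value is the mux output (number of registers minus one).

```python
def data_count(index):
    """Return the encoded data count for a 32-bit mapping index,
    assuming at least one register is always programmed and that
    disabled fields occupy the most-significant positions."""
    for count in range(7):
        low_bits = 4 * (count + 1)     # bits that may hold enabled fields
        upper_bits = 32 - low_bits     # 28, 24, 20, ... 4
        mask = ((1 << upper_bits) - 1) << low_bits
        if index & mask == mask:       # all upper bits are ones
            return count
    return 7                           # every field enabled: 8 registers


fig9_count = data_count(0xFFFC5410)    # five enabled fields
```

For the FIG. 9 index the check on the upper 12 bits is the first to succeed, giving a data count of 4, that is, five registers.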

FIG. 12 is a diagram of a register programmer that uses a mapping index to select registers for programming. The data count indicates the number of registers to be programmed from the current mapping index. This data count is input to state machine 60 to determine the number of cycles needed to program the registers. State machine 60 generates sequences of control signals (not shown) such as bus and register enables to transfer data to and from the programmable registers.

When the first entry is read from the command FIFO, the first entry contains the index. The first data entry is latched into index register 54 through mux/shifter 52. The lowest four bits of the index from index register 54 are extracted and decoded by decoder 56 to enable one of the 15 register-enable signals REG[E:0]_EN. Thus the first mapping field is decoded to select the first register to be accessed. The next data entry read from the command FIFO is latched into data registers 58 and written to the selected register.

The mapping index is then shifted down by one mapping field. The mapping index from index register 54 is fed back to mux/shifter 52, which shifts the index down by four bits so that the second mapping field occupies the lowest four bits. These new lowest four bits are then decoded by decoder 56 to generate the next register-enable signal REG[E:0]_EN for the second mapping field. The next data entry read from the command FIFO is latched into data registers 58 and written to the selected register.

Mux/shifter 52 then shifts the index from index register 54 down by another four bits, allowing the third mapping field to be decoded by decoder 56. The third register is then written or read. This process of shifting the mapping index and accessing the register indicated by the shifted mapping field continues until all registers have been programmed. The end of programming can be detected by decoder 56 finding a mapping field containing all ones, or by the data count being reached by state machine 60.
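The shift-and-decode sequence of FIG. 12 can be approximated in Python as follows. This is a sketch under stated assumptions: each register-enable signal is modelled as a one-hot 15-bit value, termination uses the all-ones field rather than the data count, and the name `enable_sequence` is hypothetical.

```python
def enable_sequence(index):
    """Return the one-hot register-enable values produced as the index
    is shifted down one 4-bit mapping field at a time."""
    enables = []
    shifted = index
    for _ in range(8):
        field = shifted & 0xF        # lowest four bits feed the decoder
        if field == 0xF:             # all ones: end of programming
            break
        enables.append(1 << field)   # exactly one enable line asserted
        shifted >>= 4                # shifter moves the next field down
    return enables


enables = enable_sequence(0xFFFC5410)
```

Each element corresponds to one register write, so pairing the list with successive data entries from the command FIFO reproduces the programming order.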

Data registers 58 can be eliminated when no pipelining is used, or additional levels of pipeline registers can be used. Data registers 58 could be replaced with a data FIFO. The decoder and register-enables could also be pipelined.

ALTERNATE EMBODIMENTS

Several other embodiments are contemplated by the inventor. For example, several different mapping index words may be used, each with a different starting address or different command FIFOs for a different bank of registers. The second mapping index word may refer to the next set of registers that are offset by an additional 15 registers or some other offset. When all mapping fields in the first index are 0xF, then no registers in the first bank are programmed. Alternately, another bank field may be added to the index to indicate which bank is to be programmed. This bank field can be decoded to select one bank from among several banks. The bank field could be located in a programmable register rather than in the index, or some other means such as an I/O pin could be used to select banks. Byte, word, double-word, or other addressing may be used. Additional restrictions may be placed on the mapping index word, such as requiring that the disabled mapping fields be the MSBs or that at least one mapping field be enabled.
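One way the bank offset could work is sketched below in Python. The sketch assumes each bank holds 15 registers selected by a 4-bit field and that bank numbering starts at zero; the function name `absolute_register` and the `bank_size` parameter are illustrative assumptions only.

```python
def absolute_register(bank, field, bank_size=15):
    """Map a 4-bit mapping field within a bank to a register number
    across all banks, or None when the field is the disabling value."""
    if field == 0xF:
        return None
    return bank * bank_size + field


first_bank_c = absolute_register(0, 0xC)    # register C of the first bank
second_bank_0 = absolute_register(1, 0x0)   # first register of the next bank
```

Other offsets, or a bank field decoded from the index itself, could replace the fixed multiple of 15 used here.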

The address and the data may be input on separate busses rather than a shared bus, or on shared or separate signal lines. The address may arrive slightly before the data but with some overlap. Then a separate bus cycle may not be needed to latch the address.

Different register and addressing sizes can be used. The smallest addressable unit may be a byte, but some systems may address only 16-bit, 32-bit, or 64-bit words as the smallest writeable unit. The index register may be 32 bits, while the data register is also 32 bits. A 64-bit read can also be used to read two data registers in one bus cycle. Sixteen-bit programmable registers are also possible, as are other sizes.

Burst cycles could be used by the controller chip when reading the command FIFO, depending on the bus used between the FIFO and registers in the controller chip. The controller chip or BitBlt engine can have logic to decode the mapping index and write or read the indicated registers, so addresses are not needed for each register access. The controller chip may be a graphics engine such as a BitBlt engine, a 2D or 3D graphics accelerator, an MPEG engine, or other kinds of engines.

Registers may be accessed by reading rather than writing using additional logic. The CPU has been described as using burst cycles to send data to the command FIFO. When the CPU does not use burst cycles to write the command FIFO, the CPU sends the index as the data with the first address, which is the address to the write pointer in the command FIFO. Then the CPU sends the first data value in another cycle, but increments the write pointer and writes to the following entry in the command FIFO. Subsequent writes are to subsequent address locations in the command FIFO.

The command FIFO may be a software buffer located anywhere in memory that is written by the host CPU and read by the controller chip. The read pointer address can be advanced by the controller chip while the write pointer address is advanced by the host CPU as each index and data value is written into the FIFO. Boundary addresses for the command FIFO can also be kept and referenced to determine when to wrap pointers around at the end of the buffer. The command FIFO could also be a hardware FIFO that uses hardware-based pointers or even shift registers. Then the host can write all index and data values to the same address, the address of the top of the FIFO.
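A software command FIFO with wrapping read and write pointers might look like the following Python sketch. The class name `CommandRing`, its method names, and the convention of keeping one slot empty to distinguish full from empty are assumptions made for illustration.

```python
class CommandRing:
    """Circular buffer: the host advances `write`, the controller
    advances `read`, and both wrap at the buffer boundary."""

    def __init__(self, size):
        self.entries = [0] * size
        self.read = 0
        self.write = 0

    def push(self, value):
        """Host side: store an index or data value."""
        next_write = (self.write + 1) % len(self.entries)
        if next_write == self.read:
            raise BufferError("command FIFO full")
        self.entries[self.write] = value
        self.write = next_write

    def pop(self):
        """Controller side: return the next entry, or None when the
        read pointer has reached the write pointer."""
        if self.read == self.write:
            return None
        value = self.entries[self.read]
        self.read = (self.read + 1) % len(self.entries)
        return value


ring = CommandRing(4)
for value in (1, 2, 3):
    ring.push(value)
first = ring.pop()
ring.push(4)                    # write pointer wraps to slot 0
rest = [ring.pop(), ring.pop(), ring.pop(), ring.pop()]
```

The final `pop` returns `None` because the read pointer has caught up with the write pointer, which is the stop condition for the controller.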

The index register does not have to be the same width as the data registers. For example, a 32-bit index register with 4-bit mapping fields can be used to program up to eight of fifteen 64-bit data registers. A 64-bit index register can program up to 10 data registers in a bank of 63 registers using 6-bit mapping fields, and those data registers can be any size. Multiple index registers can be separately addressed, each controlling a different bank or set of data registers. The encoding of the index word may be varied with binary, gray-code, or other encodings that identify which of the programmable registers are to be written. The programmable registers could span several chips.
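The figures quoted above follow from simple arithmetic, shown here as a short Python sketch. The function name `index_geometry` is hypothetical; the calculation assumes one field value (all ones) is reserved as the disabling value.

```python
def index_geometry(index_bits, field_bits):
    """Return (mapping fields per index, selectable registers per bank)."""
    fields = index_bits // field_bits   # whole fields that fit in the index
    bank = 2 ** field_bits - 1          # one code is reserved for "disabled"
    return fields, bank


narrow = index_geometry(32, 4)   # 32-bit index, 4-bit fields
wide = index_geometry(64, 6)     # 64-bit index, 6-bit fields
```

With 6-bit fields, four bits of a 64-bit index are left over, which is why ten rather than eleven registers can be programmed per index.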

Many different I/O addresses can be used for the index and data registers. An indexing scheme may be used where the address is first written to an index or control register, then the data is written to a single data register and routed to the correct data register identified by the index. The mapping index word could point to non-consecutive data registers that are normally accessed by a 2-step indexing scheme, thus bypassing the index. The invention may be applied in a highly-integrated chip, such as a graphics controller integrated together with a systems-logic controller that includes the command FIFO.

The abstract of the disclosure is provided to comply with the rules requiring an abstract, which will allow a searcher to quickly ascertain the subject matter of the technical disclosure of any patent issued from this disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. 37 C.F.R. § 1.72(b). Any advantages and benefits described may not apply to all embodiments of the invention. When the word “means” is recited in a claim element, Applicant intends for the claim element to fall under 35 USC § 112, paragraph 6. Often a label of one or more words precedes the word “means”. The word or words preceding the word “means” is a label intended to ease referencing of claim elements and is not intended to convey a structural limitation. Such means-plus-function claims are intended to cover not only the structures described herein performing the function and their structural equivalents, but also equivalent structures. For example, although a nail and a screw have different structures, they are equivalent structures since they both perform the function of fastening. Claims that do not use the word “means” are not intended to fall under 35 USC § 112, paragraph 6. Signals are typically electronic signals, but may be optical signals such as can be carried over a fiber optic line.

The foregoing description of the embodiments of the invention has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. It is intended that the scope of the invention be limited not by this detailed description, but rather by the claims appended hereto.

1. A graphics engine comprising: a command first-in-first-out FIFO for storing entries written by a host, the entries being commands and data to the graphics engine from the host, the entries including a first entry that is an index value, and a plurality of data entries, the first entry being written to an address of the command FIFO; an index register, written with the index value from the first entry; a selector, coupled to the index register, for selecting a multi-bit mapping field from the index register, wherein the index value contains a plurality of multi-bit mapping fields each capable of indicating a different register in a bank of programmable registers; a decoder, coupled to receive the multi-bit mapping field selected by the selector from the index register, for decoding the multi-bit mapping field to indicate a selected register in the bank of programmable registers; and a switching bus, coupled to transfer one of the plurality of data entries from the command FIFO to the selected register to write a data value to the selected register; wherein the plurality of data entries from the command FIFO are written to several different selected registers in the bank of programmable registers by the selector selecting different multi-bit mapping fields from the index register and the decoder decoding different multi-bit mapping fields to cause the switching bus to write the data entries to the several different selected registers, whereby the multi-bit mapping fields in the index register are selected and decoded to write data values to the several different selected registers in the bank of programmable registers.
 2. The graphics engine of claim 1 wherein the index value contains at least 8 multi-bit mapping fields.
 3. The graphics engine of claim 2 wherein each multi-bit mapping field is at least four bits; wherein the bank of programmable registers includes at least 15 registers that can be selected as the selected register for programming with a data value; wherein the data entries include up to eight data values for programming up to 8 of the 15 registers using the entries that include the index value, whereby up to eight registers in a bank of 15 registers can be programmed with one index value.
 4. The graphics engine of claim 1 wherein each multi-bit mapping field is N bits; wherein the bank of programmable registers includes at least 2^(N)−1 registers that can be selected as the selected register for programming with a data value; wherein the data entries include up to 2^(N-1) data values for programming up to 2^(N-1) of the 2^(N)−1 registers using the entries that include the index value, whereby up to 2^(N-1) registers in a bank of 2^(N)−1 registers can be programmed with one index value.
 5. The graphics engine of claim 4 wherein the entries stored in the command FIFO include a plurality of entry groups, each entry group containing up to 2^(N-1) data entries in the plurality of data entries and one index entry, wherein each entry group can program up to 2^(N-1) selected registers using 2^(N-1)+2 entries in the command FIFO.
 6. The graphics engine of claim 5 wherein when at least one of the multi-bit mapping fields contains a disabling value, fewer than 2^(N-1) selected registers in the bank of programmable registers are written with the plurality of data entries, whereby the disabling value in a multi-bit mapping field reduces a number of registers programmed.
 7. The graphics engine of claim 6 wherein the multi-bit mapping field contains all ones to indicate the disabling value.
 8. The graphics engine of claim 7 further comprising: compare logic that compares the multi-bit mapping field to the disabling value and disables writing the data value to the selected register when the multi-bit mapping field contains the disabling value.
 9. The graphics engine of claim 6 further comprising: a priority encoder coupled to a plurality of compare logic that compares several multi-bit mapping fields in the index value to the disabling value, for determining a count number of multi-bit mapping fields that do not contain the disabling value; wherein the switching bus programs the count number of data values into the count number of selected registers.
 10. The graphics engine of claim 6 further comprising: a shifter, coupled to the index register, for shifting the index value to apply a different multi-bit mapping field to the decoder in different cycles, whereby the index value is shifted over the different cycles to decode different multi-bit mapping fields in the index register.
 11. The graphics engine of claim 10 wherein the host is a central processing unit CPU and the graphics engine is a bit-block-transfer BitBlt engine and the command FIFO is a memory between the host and the BitBlt engine.
 12. A method for programming registers comprising: sending an address value used to address a first entry in a command FIFO; writing an index value to the first entry in the command FIFO; writing a first data value to a second entry in the command FIFO; writing a second data value to a third entry in the command FIFO; reading the first entry from the command FIFO; reading the index value from the first entry in the command FIFO and writing the index value to an index register; selecting and decoding a first mapping field in the index register, the first mapping field having multiple bits that encode a first register offset that identifies a first selected register in a bank of programmable registers; transferring the first data value in the second entry from the command FIFO to the first selected register to write the first selected register with the first data value; selecting and decoding a second mapping field in the index register, the second mapping field having multiple bits that encode a second register offset that identifies a second selected register in the bank of programmable registers; and transferring the second data value in the third entry from the command FIFO to the second selected register to write the second selected register with the second data value.
 13. The method of claim 12 further comprising: comparing a third mapping field in the index register to a predetermined disable value; and when the third mapping field matches the predetermined disable value, programming a different bank of programmable registers by reading a different first entry from the command FIFO.
 14. The method of claim 12 further comprising: writing a third data value to a fourth entry in the command FIFO; writing a fourth data value to a fifth entry in the command FIFO; selecting and decoding a third mapping field in the index register, the third mapping field having multiple bits that encode a third register offset that identifies a third selected register in the bank of programmable registers; transferring the third data value in the fourth entry from the command FIFO to the third selected register to write the third selected register with the third data value; selecting and decoding a fourth mapping field in the index register, the fourth mapping field having multiple bits that encode a fourth register offset that identifies a fourth selected register in the bank of programmable registers; and transferring the fourth data value in the fifth entry from the command FIFO to the fourth selected register to write the fourth selected register with the fourth data value, whereby at least four selected registers are programmed using one index value.
 15. A programmable controller comprising: memory means for storing a plurality of entries from a host, the entries including groups of entries that include a plurality of data entries; wherein each group of entries has multiple data entries; index register means for storing a mapping index that is a data entry in the current group of entries; select means, coupled to the index register means, for selecting a current mapping field from a plurality of mapping fields in the mapping index; decode means, coupled to the select means, for decoding multiple bits in the current mapping field to identify a current register for programming; data transfer means, responsive to the decode means, for transferring one of the plurality of data entries from the current group of entries to the current register to program the current register; and sequencing means for instructing the select means to select another mapping field from the plurality of mapping fields in the mapping index as the current mapping field, the decode means decoding the multiple bits in the current mapping field to identify another current register, the data transfer means transferring another one of the plurality of data entries from the current group of entries to the another current register to program the another current register, whereby multiple registers are programmed from data entries in the current group of entries.
 16. The programmable controller of claim 15 further comprising an address match means for causing the decode means to identify current registers from a different bank of programmable registers when an address corresponds to a different bank of programmable registers, wherein the address is an address of the command FIFO or an address field in an index entry in the command FIFO that contains the mapping index.
 17. The programmable controller of claim 15 wherein the mapping index contains a plurality of at least eight mapping fields, wherein the select means can select up to eight different registers as the current register, whereby up to eight registers are programmed from the current group of entries.
 18. The programmable controller of claim 17 wherein registers are programmed in a sequential or in a non-sequential order wherein registers can be skipped over during programming by a group of entries.
 19. The programmable controller of claim 17 wherein the registers can be programmed in any order.
 20. The programmable controller of claim 17 wherein the memory means comprises a command FIFO for storing graphics data and commands to the programmable controller.
 21. An apparatus configured to perform a burst transfer of data values to a bank of programmable registers, the apparatus comprising: a first-in-first-out (FIFO) storage unit including an index value storage location and a plurality of data value storage locations, wherein the index value storage location includes a plurality of storage fields, wherein each of said plurality of storage fields is associated with one of the plurality of data value storage locations, wherein each of the storage fields is configured to store a multi-bit destination value indicative of one of the programmable registers within the bank of programmable registers; and a control unit configured to route data values stored in each of the plurality of data value storage locations from the FIFO storage unit to a respective one of the plurality of programmable registers according to the multi-bit destination value stored in the associated storage field of the index value storage location; wherein, for a given burst transfer to be performed by the apparatus, the index value storage location is configured to receive a plurality of multi-bit destination values to be stored in respective ones of the plurality of storage fields, wherein the apparatus is configured to receive multi-bit destination values only for those programmable registers within the bank of programmable registers that are to be written by the given burst transfer, and wherein the apparatus is configured to perform the given burst transfer by transferring data values only to those programmable registers indicated by the received plurality of multi-bit destination values.
 22. The apparatus as recited in claim 21, wherein the control unit includes a selector configured to determine an order in which data values stored in the plurality of data value storage locations are routed to their respective programmable registers within the bank of programmable registers.
 23. The apparatus as recited in claim 22, wherein the control unit includes a counter configured to store a count value indicative of a number of data values stored in the plurality of data value storage locations to be routed to the bank of programmable registers by the control unit.
 24. The apparatus as recited in claim 21, wherein the control unit includes a decoder configured to decode encoded values stored in the plurality of storage fields in order to route data values from the plurality of data value storage locations to the bank of programmable registers.
 25. The apparatus as recited in claim 21, wherein the control unit includes a bus switch configured to route data values stored in the plurality of data value storage locations to the bank of programmable registers.
 26. The apparatus as recited in claim 21, wherein the control unit is further configured to select and decode multi-bit destination values stored in each of the plurality of storage fields in order to generate control information for routing data values from the bank of programmable registers to the FIFO storage unit.
 27. The apparatus as recited in claim 26, wherein the control unit is further configured to route one or more data values from one or more of the registers in the bank of programmable registers to the FIFO storage unit based upon multi-bit destination values stored in a second index value storage location.
 28. The apparatus as recited in claim 21, wherein each of the plurality of storage fields comprises N bits, and wherein the bank of programmable registers includes up to 2^(N)−1 registers, wherein N is greater than or equal to 2.
 29. The apparatus as recited in claim 21, wherein the apparatus is configured to store first and second data values in first and second ones of the plurality of data value storage locations, respectively; and wherein, as part of the burst transfer, the apparatus is configured to route the first and second data values to non-consecutive ones of the bank of programmable registers.
 30. The apparatus of claim 21, wherein, for a burst transfer that is to program a number of programmable registers that is less than the number of the plurality of storage fields, the apparatus is configured such that those storage fields that are not to be utilized during the burst transfer include a value that indicates that the storage field is not indicative of any of the programmable registers within the bank of programmable registers.
 31. A method, comprising: storing, by a storage device, a plurality of data values to be written to storage locations within a plurality of storage locations by a transfer operation; storing, by the storage device, an index value for the transfer operation, wherein storing the index value includes storing a first plurality of multi-bit destination location values, wherein each of the first plurality of multi-bit destination location values is associated with one of the plurality of data values, and wherein each of the stored multi-bit destination location values indicates one of the plurality of storage locations, and wherein the stored index value includes multi-bit destination location values only for those storage locations within the plurality of storage locations that are to be written by the transfer operation; performing the transfer operation by routing each of the plurality of data values from the storage device to one of the plurality of storage locations based upon the associated multi-bit destination location value in the stored index value.
 32. The method as recited in claim 31, further comprising determining an order in which the plurality of data values are routed to their respective storage locations.
 33. The method as recited in claim 31, wherein the storage device is a FIFO, and wherein the plurality of storage locations are graphics registers.
 34. The method as recited in claim 31, further comprising: storing, by the storage device, a second index value including a second plurality of multi-bit destination location values; and selecting and decoding, by the storage device, each of the second plurality of multi-bit destination location values in order to determine one or more data values to be routed from the plurality of storage locations to the storage device, wherein destination location values in the second plurality of multi-bit destination location values are indicative of the storage locations storing data values to be routed to the storage device.
 35. The method as recited in claim 31, wherein each of the plurality of multi-bit destination location values comprises N bits, and wherein the plurality of destination storage locations comprises up to 2^(N)−1 registers.
 36. The method as recited in claim 31, wherein performing the transfer operation includes routing data values associated with consecutively stored multi-bit destination location values to non-consecutive ones of the plurality of storage locations.
 37. The method of claim 31, wherein each of the first plurality of multi-bit destination location values is stored in a respective field of the index value, and wherein performing the transfer operation further includes determining that the stored index value includes a field having a predetermined value that is not indicative of any of the plurality of storage locations, wherein fields in the index value having the predetermined value are not used in completing the transfer operation.
 38. An apparatus configured to perform a transfer operation, the apparatus comprising: a first-in-first-out (FIFO) storage unit including an index register and a plurality of data storage locations, wherein the index register includes a plurality of fields, each of which is associated with one of the plurality of data storage locations, wherein each of the fields is configured to store a multi-bit destination value indicative of one of a plurality of destination storage locations accessible to the FIFO storage unit, wherein, for a given transfer operation, the FIFO storage unit is configured to receive multi-bit destination values for those data storage locations to be written by the given transfer operation, but not other data storage locations; and means for routing data values stored in the plurality of data storage locations from the FIFO storage unit to respective ones of the plurality of destination storage locations indicated by the multi-bit destination values stored in the associated fields of the index register.
 39. The apparatus as recited in claim 38, wherein the means for routing is configured to route data values associated with consecutive fields in the index register to non-consecutive ones of the plurality of destination storage locations.
 40. A system comprising: a processor; a first-in-first-out (FIFO) storage unit coupled to the processor and including an index register and a plurality of data storage locations, wherein the index register includes a plurality of fields, wherein each of the plurality of fields is associated with one of the plurality of data storage locations, and wherein each of the fields is configured to store a multi-bit destination value indicative of one of a plurality of destination storage locations; and a controller coupled to the processor, the FIFO storage unit, and the plurality of destination storage locations; wherein the controller is configured to perform a transfer operation by routing data values stored in the plurality of data storage locations to respective ones of the plurality of destination storage locations according to the multi-bit destination values stored in the associated fields of the index register; wherein the apparatus is configured to perform the transfer operation by storing multi-bit destination values in the index register for those ones of the plurality of destination storage locations to be written by the transfer operation, but not other ones of the plurality of destination storage locations.
 41. The system as recited in claim 40, wherein, for a given transfer operation, the controller is configured to route data values associated with consecutive ones of the plurality of fields in the index register to non-consecutive ones of the plurality of destination storage locations.
 42. The system of claim 40, wherein the apparatus is configured to perform the transfer operation by storing a predetermined value in one or more fields of the index register, wherein the predetermined value indicates that the one or more fields are not to be used to indicate destination storage locations that are to be written during the transfer operation.
 43. An apparatus comprising: a first-in-first-out (FIFO) storage unit including an index location and a plurality of data locations, wherein said index location is configured to store an index value having two or more fields, each of which corresponds to a predetermined one of said plurality of data locations, and wherein each of said two or more fields is configured to store a multi-bit storage location value specifying one of a plurality of storage locations external to and accessible by said FIFO; a control unit configured to perform a transfer operation by routing data from the plurality of storage locations to the plurality of data locations, wherein said control unit is configured, for each of said two or more fields storing a multi-bit storage location value, to route data from the storage location specified by the multi-bit storage location value in the field to the predetermined data location corresponding to the field, wherein the apparatus is configured to implement the transfer operation by including multi-bit storage location values in the index value for those ones of the plurality of storage locations to be transferred as part of the transfer operation, but not other ones of the plurality of storage locations.
 44. The apparatus as recited in claim 43, wherein, for a given transfer operation, the control unit is configured to route data stored in non-consecutive ones of the plurality of storage locations to ones of the plurality of data locations corresponding to consecutive fields in the index value.
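
For illustration only, the register-programming sequence recited in claims 1 to 8 can be modeled in software. The sketch below is not the claimed hardware and forms no part of the claims. It assumes N = 4 bits per mapping field, eight mapping fields packed into one 32-bit index value with the first field in the least-significant bits, and an all-ones field (0xF) as the disabling value that consumes no data entry. The names pack_index and program_bank, the field ordering, and the list-based FIFO are hypothetical choices made for this example.

```python
FIELD_BITS = 4                      # N, bits per mapping field (assumed)
FIELDS_PER_INDEX = 8                # mapping fields in one index value (assumed)
DISABLE = (1 << FIELD_BITS) - 1     # all ones: field selects no register
BANK_SIZE = (1 << FIELD_BITS) - 1   # 2^N - 1 selectable registers


def pack_index(offsets):
    """Host side: pack register offsets into one index value.

    Unused trailing fields are filled with the disabling value.
    """
    if len(offsets) > FIELDS_PER_INDEX:
        raise ValueError("too many register offsets for one index value")
    for offset in offsets:
        if not 0 <= offset < BANK_SIZE:
            raise ValueError("register offset outside the bank")
    index = 0
    for slot in range(FIELDS_PER_INDEX):
        field = offsets[slot] if slot < len(offsets) else DISABLE
        index |= field << (slot * FIELD_BITS)
    return index


def program_bank(entries, bank):
    """Engine side: consume one entry group (index value, then data values).

    Returns the number of FIFO entries consumed.
    """
    index = entries[0]                      # first entry loads the index register
    next_data = 1
    for slot in range(FIELDS_PER_INDEX):
        # Selector: isolate the current mapping field (a shift, as in claim 10).
        field = (index >> (slot * FIELD_BITS)) & DISABLE
        if field == DISABLE:                # compare logic: skip disabled fields
            continue
        bank[field] = entries[next_data]    # decoder picks the register to write
        next_data += 1
    return next_data


if __name__ == "__main__":
    bank = [0] * BANK_SIZE
    index = pack_index([3, 0, 9])           # registers 3, 0, 9, in that order
    used = program_bank([index, 0xAA, 0xBB, 0xCC], bank)
    print(hex(index), used, bank[3], bank[0], bank[9])
```

In this model three registers are written, out of order and with gaps, from four FIFO entries (one index value and three data values), whereas writing an address and a data value per register would take six.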